ABSTRACT



Context. Most of the X-ray background (XRB) is generated by discrete X-ray sources. The still unresolved fraction of the XRB is
likely composed of a population of weak sources below the present detection thresholds and a truly diffuse component. The nature
of these weak sources remains a matter of discussion.

Aims. The goal is to explore the effectiveness of the nearest neighbor statistics (NNST) of the photon distribution for the investigation 
of the number counts of the very weak sources. 

Methods. All the sources generating at least two counts each induce a nonrandom distribution of counts. This distribution is analyzed 
by means of the NNST. Using the basic probability equations, the relationships between the source number counts N(S) and the 
NNST are derived. 

Results. It is shown that the method yields constraints on the N(S) relationship below the regular discrete source detection threshold.
The NNST was applied to a medium deep Chandra pointing to assess the source counts N(S) at flux levels attainable only with
very deep exposures. The results are in good agreement with the direct source counts based on the Chandra Deep Fields (CDF).

Conclusions. In the next paper of this series the NNST will be applied to the CDF to assess the source counts below the present
flux limits.

Key words. X-rays: diffuse background - X-rays: general 



1. Introduction 

The X-ray background (XRB) is mostly generated by discrete
extragalactic sources (e.g. Lehmann et al. 2001; Kim et al. 2007,
and references therein). Thus, the source counts provide the
essential information on the constituents of the XRB, and the X-ray
N(S) relationship has been a subject of numerous studies for the
last 40 years.

An individual point-like source is detected if the number of
counts within a specified area exceeds the assumed threshold.
The size of the detection box is defined by the Point Spread Function
(PSF), while the detection threshold is usually selected to minimize
the number of false detections and at the same time to maximize
the number of real sources. The detection threshold is typically
set at the level of 4 - 5 σ above the local average count
density. The presence of weaker sources, below the formal detection
threshold, manifests itself as increased fluctuations of the
count distribution compared to the fluctuations expected for
random counts.

A common approach to assessing the counts of sources weaker than
the detection limit is based on the analysis of count density
fluctuations. To quantify the signal generated by the discrete sources
one determines the intensity distribution P(D), i.e. the histogram
of the number of pixels as a function of the number of
counts. The observed function P(D) is then compared with the
functions obtained from simulated count distributions (e.g.
Hasinger et al. 1993; Miyaji & Griffiths 2002). It is assumed
that the simulated source counts represent the actual source
distribution if the model P(D) function mimics the observed
histogram. The contribution of point sources to the count distribution
can also be estimated using the auto-correlation function (ACF).
Since the integral of the ACF is directly related to the second
moment of the P(D) distribution (e.g. Soltan 1991), both methods
are closely related.

An innovative method to assess the number of weak sources
was proposed by Georgakakis et al. (2008). In their approach the
count distribution in the detection cell is explicitly expressed as
a sum of the source and background counts. As a result of a
rigorous application of the Poisson statistics, a flux probability
distribution is derived as a function of the total and background
counts observed in the detection cell. This probability distribution,
combined with the adequately defined sensitivity map of a
given observation, is then used to estimate the source number
counts.

The count fluctuations are proportional to the source intensities.
Thus, the observed fluctuation amplitude is dominated
by the sources just below the detection threshold set for the
individual objects, whereas it is only weakly sensitive to the
fainter sources which produce a smaller number of counts. One
should note, however, that every source which produces more
than one count generates a deviation from the random count
distribution. In the present paper we investigate the efficiency of
the nearest neighbor statistics (NNST) for the weak source
analysis and we show that the NNST is a powerful tool to
estimate the source counts, N(S), down to very low flux levels.
We apply this technique to one of the Chandra AEGIS fields
with the exposure of 465 ks. The NNST allows us to obtain
the N(S) relationship extending down to $2 \times 10^{-17}$ erg cm$^{-2}$ s$^{-1}$
and $7 \times 10^{-17}$ erg cm$^{-2}$ s$^{-1}$ in the 0.5 - 2 keV and 2 - 8 keV
energy bands, respectively, i.e. a factor of 5 - 10 below the
standard sensitivity threshold corresponding to this exposure. Since
the present count estimates are contained within the flux range
covered by the direct source counts derived from the deepest
Chandra fields (such as CDFS), the effectiveness of the NNST
method can be directly assessed.

The organization of the paper is as follows. In the next section,
the method and all the relevant formulae are presented.
Then, in Sect. 3, the observational material is described and the
computational details, including questions related to the PSF, are
given. Results of the calculations, i.e. estimates of the source
counts below the nominal sensitivity limit, are presented in
Sect. 4. The results are summarized and discussed in Sect. 5,
where prospects for the application of the NNST to the deep
Chandra fields are also presented.

2. The nearest neighbor statistics 

In this section general formulae based on the theory of probability
are derived, while all the details related to the actual
observations (e.g. relationship between counts and photon energy,
instrument sensitivity) are discussed in Sect. 3.

In the present consideration we assume that counts in a
given Chandra ACIS observation (footnote 1) are distributed according to
the following model: some a priori unknown fraction of counts is
randomly distributed, i.e. the positions of the counts are fully
described by the Poisson statistics, while all the remaining photons
are clustered in point-like sources. Such a model corresponds
to an observation obtained using an idealized telescope without
vignetting and a detector with perfectly uniform response over
the field of view. The real telescope-detector combination
introduces numerous deformations to this ideal picture. We address
this question below.

We further assume that the positions of sources are also randomly
distributed. Smoothly distributed counts constitute a physically
heterogeneous collection which contains both the particle
background and various components of the foreground X-ray
emission, including scattered solar X-rays, the geocoronal
oxygen lines, as well as the thermal emission of hot plasma
within our Galaxy (e.g. Hasinger 1992; Galeazzi et al. 2007;
Henley et al. 2007) and the emission by the WHIM (e.g. Soltan
2007). Truly diffuse extragalactic background (Soltan 2003) and
weak discrete sources, each producing exactly one photon in the
final image, also contribute to these counts. A separation of the
extragalactic component from all the counts can be achieved in
statistical terms using spectral information; however, an individual
event cannot be definitely classified as local or extragalactic.

Discrete sources in deep X-ray exposures are predominantly
extragalactic. Nevertheless, galactic sources are potentially
present in the data and are included in the calculations.
Photons coming from discrete sources are distributed in the
image in clumps defined by the PSF of the telescope.

Within the present model, the statistical characteristics of the
count distribution are fully defined: there are two populations of
events, the first is randomly distributed and the second is
concentrated in PSF-shaped clusters. This feature of the count
distribution is conveniently formulated using the NNST. Let $n_t$
denote the total number of counts in the investigated field, $n_1$
the number of events distributed randomly, or "single photons",
$n_2$ the number of photons due to sources producing exactly
2 photons each, $n_3$ the number of photons due to "three photon"
sources, and so on. Thus,

$$ n_1 + n_2 + \cdots + n_k + \cdots + n_{k_{\max}} = n_t , \tag{1} $$

1 In all the analysis we use ACIS-I chips 0-3. 



where the sum on the left-hand side extends over all the sources and
$k_{\max}$ is the number of photons produced by the brightest source in
the field. The number of photons is related to the number of
sources in an obvious way: $k\,N(k) = n_k$, where $N(k)$ denotes the
number of "k-photon sources".

Using the basic relationships of probability theory, one
can calculate $P(r)$, the probability that the distance to the
nearest neighbor of a randomly chosen event exceeds $r$:

$$ p_1 P(r \mid 1) + p_2 P(r \mid 2) + \cdots + p_{k_{\max}} P(r \mid k_{\max}) = P(r) , \tag{2} $$

where $p_k$ denotes the probability that the randomly chosen photon
is produced by a "k-photon source" ($k = 1, 2, \ldots, k_{\max}$),
and $P(r \mid k)$ is the conditional probability that there are no other
counts within $r$ provided the selected event belongs to a
k-photon source.

The probability $P(r)$ can be estimated for a given distribution
of counts by measuring the distance to the nearest neighbor
for each photon in the field. Similarly, $P(r \mid 1)$ is given
by the distribution of distances from randomly distributed points
to the nearest photon. Assuming that the distribution of
"single counts" is not correlated with the distribution of photons
from $k \ge 2$ sources, and that sources are distributed randomly,
the probabilities $P(r \mid k)$ for $k \ge 2$ are related to $P(r \mid 1)$
and to the PSF by the expression:

$$ P(r \mid k) = P(r \mid 1) \cdot \mathcal{P}(r \mid k) , \tag{3} $$

where $\mathcal{P}(r \mid k)$ is the probability that the distance from a
randomly chosen photon produced by a k-photon source to its
nearest neighbor from the same source exceeds $r$. This quantity
is fully defined by the PSF.
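The two empirical distributions described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the procedure actually coded for the paper: the function names, the brute-force neighbor search, and the rectangular field of size `width` x `height` are all hypothetical, and edge effects and exposure masks are ignored.

```python
import math
import random

def nearest_distance(point, events, skip_index=None):
    """Distance from `point` to the closest event, optionally skipping one index."""
    best = math.inf
    for j, (x, y) in enumerate(events):
        if j == skip_index:
            continue
        d = math.hypot(point[0] - x, point[1] - y)
        if d < best:
            best = d
    return best

def survival(distances, radii):
    """Fraction of distances exceeding each r, i.e. an estimate of P(> r)."""
    n = len(distances)
    return [sum(d > r for d in distances) / n for r in radii]

def nnst_estimates(events, width, height, radii, n_random=2000, seed=0):
    """Return estimates of (P(r), P(r|1)) for an event list on a rectangular field."""
    rng = random.Random(seed)
    # P(r): nearest neighbor of every observed event (the event itself excluded).
    d_events = [nearest_distance(e, events, skip_index=i)
                for i, e in enumerate(events)]
    # P(r|1): nearest observed event around uniformly drawn random points.
    d_random = [
        nearest_distance((rng.uniform(0, width), rng.uniform(0, height)), events)
        for _ in range(n_random)
    ]
    return survival(d_events, radii), survival(d_random, radii)
```

For realistic event lists (tens of thousands of counts) the quadratic search would be replaced by a spatial index; the brute-force version is kept here only for clarity.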

To estimate the probabilities $p_k$ we now use the ratio $n_k / n_t$, and
Eq. (2) takes the form:

$$ \frac{n_1}{n_t} P(r \mid 1) + \sum_{k=2}^{k_{\max}} \frac{n_k}{n_t}\, P(r \mid 1)\, \mathcal{P}(r \mid k) = P(r) . \tag{4} $$

It is worth noting that Eq. (4) is linear in the photon counts $n_k$,
which makes it particularly suitable for estimating the contribution
of weak sources to the total counts. In contrast, both the second
moment of the count distribution in pixels and the autocorrelation
function depend on squares of the photon counts. Substituting
successive values of $r$ into Eq. (4), a set of linear equations is
constructed which allows us to estimate the unknown counts $n_k$.
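To make the linearity explicit, the left-hand side of Eq. (4) can be evaluated on a grid of separations as in the sketch below. The function name and the data layout (lists sampled at the same separations, dictionaries keyed by k) are assumptions introduced for this illustration only.

```python
def nnst_model(p_r1, cal_p, n_k):
    """Evaluate the left-hand side of Eq. (4) on a grid of separations.

    p_r1  : list of P(r|1) values, one per separation r
    cal_p : dict {k: list of same-source probabilities for k-photon sources}, k >= 2
    n_k   : dict {k: number of photons from k-photon sources}, k >= 1
    """
    n_t = sum(n_k.values())
    model = []
    for i, p1 in enumerate(p_r1):
        # Randomly distributed ("single") photons.
        total = n_k.get(1, 0) / n_t * p1
        # Photons clustered in k-photon sources, k >= 2.
        for k, profile in cal_p.items():
            total += n_k.get(k, 0) / n_t * p1 * profile[i]
        model.append(total)
    return model
```

Each separation r gives one equation that is linear in the fractions n_k / n_t, so stacking several separations yields the linear system mentioned in the text.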

In a deep Chandra observation a large number of relatively
strong sources is detected and the range of source fluxes is much
too wide to apply Eq. (4) in the form given above, where the source
fluxes are listed consecutively from $k = 2$ up to some maximum
value $k_{\max}$ representing the strongest source in the field. Since
we are interested in the counts at the faint end (down to $k = 2$),
the value of $k_{\max}$ is selected at the conventional detection limit.
All the sources above this threshold are pinpointed and removed
from the data, and the subsequent analysis is concentrated on the
sources which cannot be individually recognized.

A. M. Soltan: The nearest neighbor statistics for X-ray source counts I.The method

One can express the number of photons due to k-photon
sources by means of the differential source counts $N(S)$:

$$ n_k = k \int_{S_{\min}}^{S_{\max}} dS \, N(S)\, \mathcal{P}(k \mid S) , \tag{5} $$

where $\mathcal{P}(k \mid S)$ is the probability that a source generating flux
$S$ delivers $k$ photons, while the integration limits $S_{\min}$ and $S_{\max}$
define the full range of the source fluxes. It is convenient to
introduce the instrumental count as a flux unit. The flux $s$ expressed
in ACIS counts in a given observation is related to the
flux in physical units, $S$, by:



Table 1. The Chandra AEGIS observations used in the paper 



$$ s = S / \mathrm{cf} , \tag{6} $$



where cf is the conversion factor, which has units of
"erg cm$^{-2}$ s$^{-1}$/count" and is related to the parameter "exposure
map" defined in the standard processing of the ACIS data (footnote 2):
exposure map = cf $\cdot\, \langle E \rangle$, where $\langle E \rangle$ denotes the average photon
energy.

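For the power law source counts, $N(s) = N_s\, s^{-b}$, we have:

$$ n_k = N_s\, \frac{\Gamma(k-b+1,\, s_{\min}) - \Gamma(k-b+1,\, s_{\max})}{\Gamma(k)} . \tag{7} $$

Here $\Gamma(a, x)$ is the upper incomplete gamma function; Eq. (7)
follows from Eq. (5) when $\mathcal{P}(k \mid s)$ is the Poisson probability
$s^k e^{-s} / k!$. If the slope of the counts $b$ is constant over a
sufficiently wide range of fluxes (i.e. $s_{\min} \ll 2$ and $s_{\max} > k_{\max}$),
one can replace the integration limits $s_{\min}$ and $s_{\max}$ by 0 and
$\infty$, respectively, to get:

$$ n_k = N_s\, \frac{\Gamma(k-b+1)}{\Gamma(k)} . \tag{8} $$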
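As a numerical cross-check, Eq. (8) can be compared with a direct integration of Eq. (5). The sketch below assumes that the probability of detecting k photons from a source of flux s is Poissonian, which is consistent with the gamma functions in Eqs. (7) and (8); the function names, the upper integration limit and the midpoint rule are choices made here for illustration.

```python
import math

def n_k_power_law(k, b, norm=1.0):
    """Eq. (8): photons from k-photon sources for N(s) = norm * s**(-b)."""
    # lgamma avoids overflow for large k; requires k - b + 1 > 0.
    return norm * math.exp(math.lgamma(k - b + 1.0) - math.lgamma(k))

def n_k_numeric(k, b, norm=1.0, s_max=200.0, steps=200000):
    """Eq. (5) by midpoint integration over s, assuming a Poisson P(k|s)."""
    ds = s_max / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        # Poisson probability s^k e^(-s) / k!, evaluated in log space.
        poisson = math.exp(k * math.log(s) - s - math.lgamma(k + 1.0))
        total += norm * s ** (-b) * poisson * ds
    return k * total
```

For b = 1 the ratio of gamma functions equals unity for every k, i.e. all k-photon classes then contribute the same number of photons; steeper counts shift the photon budget towards small k.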



Thus, for source counts represented by a single power
law over a sufficiently wide range of fluxes, Eq. (4) takes the form:

$$ \frac{n_1}{n_t} P(r \mid 1) + \frac{N_s}{n_t} \sum_{k=2}^{k_{\max}} \frac{\Gamma(k-b+1)}{\Gamma(k)}\, P(r \mid 1)\, \mathcal{P}(r \mid k) = P(r) . \tag{9} $$



The extension of this expression to source counts with a
varying slope is straightforward. In particular, for a broken
power law, the function $\Gamma(k-b+1)$ is replaced by a proper
combination of the incomplete gamma functions. Substituting

$$ n_1 = n_t - \sum_{k=2}^{k_{\max}} n_k \tag{10} $$

and using Eq. (8) we finally get:

$$ \frac{N_s}{n_t}\, P(r \mid 1) \sum_{k=2}^{k_{\max}} \frac{\Gamma(k-b+1)}{\Gamma(k)} \left[ 1 - \mathcal{P}(r \mid k) \right] = P(r \mid 1) - P(r) . \tag{11} $$

Equation (11) contains two parameters which fully describe
the power law source counts in the investigated flux range, viz.
the normalization $N_s$ and the slope $b$. Since the normalization at
the flux $s = k_{\max}$ is defined by the actual source counts above
this threshold, only the slope remains unknown.

3. Observational material 

To test the efficiency of the present method I have selected a set of 16
close Chandra pointings within the AEGIS (footnote 3). The observations
span a period of 6 months and the data have been processed in
a uniform way with the recent pipeline processing versions. The
details of the 16 observations used in the present paper are given in
Table 1. All the exposures have been scrutinized with respect to
background flares and only "good time intervals" were used
in the subsequent analysis. The data have been split into two
energy bands: S - soft (0.5 - 2 keV) and H - hard (2 - 8 keV).



2 See http://asc.harvard.edu/ciao. For real observations, both the
cf and the exposure map are functions of position. At this stage these
parameters are assumed constant.

3 All-wavelength Extended Groth strip International Survey, see
http://aegis.ucolick.org/index.html



Obs.    Observation and processing     Processing    Exposure
ID      dates                          version       time [s]

9450    (illegible)    (illegible)     7.6.11.3       29100
9451    (illegible)    (illegible)     7.6.11.4       25350
9793    (illegible)    (illegible)     7.6.11.4       44750
9725    (illegible)    (illegible)     7.6.11.6       28050
9842    (illegible)    (illegible)     7.6.11.6       19450
9844    (illegible)    (illegible)     7.6.11.6       34600
9866    2008-06-03     2008-06-05      7.6.11.6       31450
9726    2008-06-05     2008-06-06      7.6.11.6       39750
9863    2008-06-07     2008-06-07      7.6.11.6       24100
9873    2008-06-11     2008-06-12      7.6.11.6       30850
9722    2008-06-13     2008-06-15      7.6.11.6       19900
9453    2008-06-15     2008-06-17      7.6.11.6       22150
9720    2008-06-17     2008-06-18      7.6.11.6       26000
9723    2008-06-18     2008-06-20      7.6.11.6       30950
9876    2008-06-22     2008-06-23      7.6.11.6       25050
9875    2008-06-23     2008-06-25      7.6.11.6       33150

Total exposure                                       464650



3.1. The exposure map 

The observations listed in Table 1 were merged to create a single
count distribution and exposure map. A circular area covered
by all the pointings with a relatively uniform exposure, centered
at RA = 14h 20m 12s, Dec = 53°00' with a radius of 6'.0, has been
selected for further processing. The exposure map of an individual
observation, resulting from various instrumental characteristics
(footnote 4), has a complex structure. In effect, the exposure map of the
merged observation is even less regular and is devoid of any clear
symmetries. On the other hand, the rough texture of the individual
exposures is reduced and smoothed in the sum of 16 components.
To further reduce the variations of the exposure map over the
investigated area, a threshold of the minimum exposure has been
set separately for each energy band. Pixels below this threshold
have not been used in the calculations.

The threshold has been defined at ~75 % of the maximum
value of the exposure map for both energy bands. In effect,
the maximum deviations of the exposure from the average value
do not exceed 15 % and 18 % in the bands S and H, respectively,
and the corresponding rms of the exposures amount to 5.9 % and
6.2 %. In Table 2 the conversion factors calculated using the
relevant amplitudes of the exposure maps are given. In the
calculations "from counts to flux" a power-law spectrum with a photon
index $\Gamma_{\rm ph} = 1.4$ was assumed (Kim et al. 2007).



Variations of the conversion factor, cf, over the investigated
area alter the source fluxes via Eq. (6) and, consequently, the
source counts $N(S)$. For power law counts the cf uncertainty
affects the counts normalization and does not change the slope.
It is shown in the Appendix that, as long as the cf variations
remain small, they modify the probability distributions $P(r \mid 1)$
and $P(r)$ in such a way that the solution of Eq. (11) is not affected.

3.2. The Point Spread Function 

The Chandra X-ray telescope PSF is a complex function of
source position and energy (e.g. Allen et al. 2004). However, to
compute the nearest neighbor probability distribution for counts
generated by a point-like source, $\mathcal{P}(r \mid k)$, we do not need a full
model of the PSF shape. The aim is to find a convenient analytic
PSF approximation adequately reproducing the NNST over
the investigated area for each energy band. The model should
be applicable to the merged AEGIS observations processed in a
standard way.

4 To name the most obvious: vignetting, gaps between chips,
telescope wobbling and all kinds of chip imperfections.

Table 2. Energy bands and conversion factors

Energy band              Conversion factors (a)
        [keV]      Average    rms      minimum    maximum
S       0.5-2      1.469      0.087    1.279      1.687
H       2-8        5.546      0.346    4.862      6.511

(a) The conversion factor (cf) has units of $10^{-17}$ (erg cm$^{-2}$ s$^{-1}$)/count.
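The entries of Table 2 can be checked for internal consistency with the percentages quoted in Sect. 3.1. The short script below is only an arithmetic illustration; the dictionary layout and function name are introduced here. Since the conversion factor scales inversely with the exposure, the relative scatter of cf need not coincide exactly with that of the exposure map.

```python
# Conversion factors from Table 2, in units of 1e-17 (erg cm^-2 s^-1)/count.
TABLE2 = {
    "S": {"average": 1.469, "rms": 0.087, "minimum": 1.279, "maximum": 1.687},
    "H": {"average": 5.546, "rms": 0.346, "minimum": 4.862, "maximum": 6.511},
}

def relative_scatter(row):
    """Return (rms, largest deviation) as percentages of the average."""
    avg = row["average"]
    rms_pct = 100.0 * row["rms"] / avg
    max_dev_pct = 100.0 * max(row["maximum"] - avg, avg - row["minimum"]) / avg
    return rms_pct, max_dev_pct

for band, row in TABLE2.items():
    rms_pct, max_dev_pct = relative_scatter(row)
    print(f"{band}: rms = {rms_pct:.1f} %, max deviation = {max_dev_pct:.1f} %")
```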

To effectively compute $\mathcal{P}(r \mid k)$ we first construct a model
PSF. Then, the relevant probability distributions are obtained using
the Monte Carlo method. We note that the analytic form of
the PSF should mimic the radial distribution of counts, while
some deviations from the azimuthal symmetry are of lesser
importance. Several simple analytic models have been tested to fit
the observed distribution of counts and it was found that a
function of the form

$$ f(<r) = \frac{r^{\alpha}}{z + r^{\alpha} + y \cdot r^{\alpha/2}} \tag{12} $$

adequately represents the encircled count fraction (ECF), where
$\alpha$, $z$, and $y$ are parameters depending on the source position and
energy. Fits of Eq. (12) to the observed count distribution for
several of the brightest sources have allowed us to find simple
relationships between these parameters and the off-axis angle. The
whole procedure is as follows. We have noticed that the PSF
parameters $\alpha$, $y$, and $\log z$ are satisfactorily approximated by
linear functions of the off-axis angle $\theta$:

$$ \alpha = a_{\alpha} \cdot \theta + b_{\alpha} , \qquad y = a_y \cdot \theta + b_y , \qquad \log z = a_z \cdot \theta + b_z , \tag{13} $$

where $a_s$ and $b_s$ ($s = \alpha, y, z$) are six parameters which are
substituted in Eq. (12) and simultaneously fitted to the observed
distribution of counts in a number (25 and 31 in the S and H band,
respectively) of the strongest point-like sources. In Fig. 1 a
sample of the resultant fits to the observed distributions in the S band
is shown. Since not all the fits are of equal quality, the effects of
the $\mathcal{P}(r \mid k)$ approximation are carefully examined. As a good
envelope of errors generated by the imperfections of our fitting
procedures we have constructed two model $\mathcal{P}(r \mid k)$ distributions
using the ECF functions systematically wider and narrower by
15 % as compared to the best fit.

Example results of this procedure are illustrated in Fig. 2,
where the envelope ECFs for two sources at 3'.2 and 6'.1 off-axis
angle are shown. Although some deviations are quite large, the
observed ECFs are predominantly contained within the ±15 %
distribution over a wide range of separations $r$.

In the observational material used in the present investigation
the maximum separations found in the NNST very rarely
exceed 2". Thus, our fits appear adequate for the NNST and it is
assumed that the ±15 % limits (indicated by the dotted curves in
Fig. 2) define the maximum systematic errors associated with the
PSF fitting procedure; they will be used to assess uncertainties in
our faint source calculations.

Fig. 1. Encircled count fraction (ECF) as a function of distance
from the count centroid. Example distributions are shown for
4 sources at 1'.0, 3'.2, 5'.0, and 6'.1 from the field center. Solid
curves - observed count distributions, dashed curves - fits
obtained using Eqs. (12) and (13).

Fig. 2. ECF as in Fig. 1 for sources at 3'.2 and 6'.1 from the
field center. Solid curves - observed count distributions, dashed
curves - best fits obtained using Eqs. (12) and (13), dotted curves -
the ECF distributions with the radius r scaled by ±15 %.

In the Monte Carlo computations of $\mathcal{P}(r \mid k)$ a population of
$10^8$ "sources" of $k = 2, 3, \ldots, k_{\max}$ counts was distributed
randomly over the investigated area (footnote 5). The distribution of counts
within each source was randomized according to the model ECF.
Then, for each source a distribution of the nearest neighbor
separations was determined and used to obtain the corresponding
amplitudes of $\mathcal{P}(r \mid k)$. The procedure has been executed for the
best fit and ±15 % ECF distributions.

5 The value of $k_{\max}$ is related to the sensitivity threshold and is different
for each energy band; see below.
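The Monte Carlo step can be illustrated with a deliberately simplified stand-in for the PSF. The sketch below assumes a circular Gaussian PSF of width `sigma` instead of the fitted ECF model, and a much smaller number of simulated sources; both choices, as well as the function name, are assumptions made only for this example. Because only separations within a source matter, every source is placed at the origin.

```python
import math
import random

def mc_same_source_survival(k, radii, sigma=0.5, n_sources=20000, seed=0):
    """Monte Carlo estimate of the probability that the nearest photon from the
    same k-photon source lies farther than r, for a Gaussian PSF (arcsec)."""
    rng = random.Random(seed)
    exceed = [0] * len(radii)
    n_photons = 0
    for _ in range(n_sources):
        # Scatter k photons around the source position according to the PSF.
        pts = [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(k)]
        for i, (x, y) in enumerate(pts):
            # Nearest neighbor among the other k - 1 photons of the same source.
            d = min(math.hypot(x - u, y - v)
                    for j, (u, v) in enumerate(pts) if j != i)
            n_photons += 1
            for m, r in enumerate(radii):
                if d > r:
                    exceed[m] += 1
    return [c / n_photons for c in exceed]
```

For k = 2 and a Gaussian PSF the result can be compared with the analytic expression exp(-r^2 / (4 sigma^2)), which provides a convenient sanity check before a more realistic ECF is plugged in.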



3.3. The "afterglow" correction 

A charge deposited by a cosmic ray in the ACIS CCD detector
may be released in two or more time frames, generating a
sequence of events, the so-called "afterglow". The events typically
span several seconds and do not need to occur in consecutive
frames (footnote 6). As a result, the data contain clumps of counts which
mimic very weak sources. Fortunately, the time sequences of
such afterglow counts span short time intervals as compared to
the exposure times of all the observations. This allows us to
unambiguously identify practically all the afterglows and remove
them from the observation.

6 See http://cxc.harvard.edu/ciao/why/afterglow.html for details.

3.4. Pixel randomization 

Positions of counts in the standard ACIS processing are randomized
within the instrument pixel, approximated by a square 0".492
a side. Since the typical nearest neighbor separations are
comparable to the pixel size, the NNST at small angular scales is
smoothed by the count randomization. The effect is significant
and generates most of the statistical noise in our calculations. To
assess uncertainties introduced by the count randomization, 12
sets of randomized observations were produced using the original
event data with non-randomized (integer) count positions.
Then, the NNST was determined for each observation and the
data were used to obtain the slope $b$ of the source number counts
by means of Eq. (11). The scatter between the 12 count slope
estimates represents the statistical error of the present method.
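The randomization step can be sketched as follows. The code assumes that an integer coordinate marks the pixel center and that offsets are drawn uniformly within the pixel; these conventions, the function names and the generic `statistic` callable (standing in for the slope estimate) are assumptions of this illustration rather than a description of the ACIS pipeline.

```python
import random

PIXEL_ARCSEC = 0.492  # ACIS pixel side quoted in the text

def randomize_positions(pixel_events, seed):
    """Spread integer pixel coordinates uniformly over each pixel; return arcsec."""
    rng = random.Random(seed)
    return [
        ((ix + rng.random() - 0.5) * PIXEL_ARCSEC,
         (iy + rng.random() - 0.5) * PIXEL_ARCSEC)
        for ix, iy in pixel_events
    ]

def scatter_over_realizations(pixel_events, statistic, n_sets=12):
    """Mean and rms of `statistic` over independently randomized event lists."""
    values = [statistic(randomize_positions(pixel_events, seed))
              for seed in range(n_sets)]
    mean = sum(values) / n_sets
    rms = (sum((v - mean) ** 2 for v in values) / (n_sets - 1)) ** 0.5
    return mean, rms
```

In the analysis described here `statistic` would be the full chain from event list to best-fit slope, so the returned rms corresponds to the statistical error quoted for b.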



3.5. Strong source removal 

To maximize the effect of the weak source population on the NNST,
the strong sources should be removed from the data. The threshold
source flux is defined by $k_{\max}$ counts in Eq. (11). Thus, the
value of $k_{\max}$ should be set at a level sufficiently large to ensure
that all the sources producing more counts than $k_{\max}$ are found
using standard source search criteria. On the other hand, $k_{\max}$
should be small enough to warrant that the number of sources is
adequately represented by the function $N(s)$.

A catalog of point sources in the AEGIS field is given by
Laird et al. (2009). For each source listed in that paper, a radius
$r_{85}$ enclosing 85 % of counts has been calculated using the PSF
at the source position. Then, several trial values of $k_{\max}$ were
applied to assess the completeness threshold in the investigated area.
It was found that for $k_{\max} = 20$ all the brighter sources are clearly
recognized, while this value is low enough to allow for a statistical
treatment of the population of still weaker sources.



4. The source counts 

4.1. The soft band

Using the selection criteria given in Sect. 3, the area of the field
and the total number of counts in the soft band amount to 97.5
sq. arcmin and 73711, respectively. After the removal of strong
sources according to the procedure described above, the area is
reduced to 92.7 sq. arcmin and the number of counts to 48737.
The average count density amounts to 0.146 per sq. arcsec and
the average distance to the nearest neighbor for the random
distribution is equal to 1".22. To eliminate contamination of the
count distribution by strong source photons in the PSF wings,
a removal radius of ~4 · $r_{85}$ was applied.
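The quoted count density follows directly from the numbers above, as the short calculation below shows. The helper `poisson_survival` gives the textbook expression for a strictly uniform Poisson field and is included only as an idealized reference; in the analysis itself the random-point distribution is measured from the data rather than taken from this formula.

```python
import math

counts = 48737          # soft-band counts left after strong-source removal
area_arcmin2 = 92.7     # remaining area in square arcminutes

density = counts / (area_arcmin2 * 3600.0)   # counts per square arcsecond
print(f"count density = {density:.3f} per sq. arcsec")

def poisson_survival(r, rho):
    """P(no count within r of a random position) for a uniform Poisson field."""
    return math.exp(-math.pi * rho * r * r)
```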

Fig. 3. The nearest neighbor probability distributions binned
with $\Delta r$ = 0".1 for the observed counts (solid histogram), and
between random and observed counts (dotted histogram); error
bars in the lower panel show 1σ uncertainties.

The calculations have been performed as follows. First, the
count distributions were used to calculate the $P(r)$ and $P(r \mid 1)$
probabilities. The distribution of distances to the nearest neighbor
for each count defines $P(r)$, while the distribution of the
distances between randomly distributed points and the nearest
observed count is used to determine $P(r \mid 1)$. The NNST has
been formulated in Sect. 2 using the cumulative probabilities,
and estimates of the count slope $b$ obtained from Eq. (11) for
different separations $r$ are dependent. To obtain a set of independent
equations, we use the differential probability distributions
$\Delta P(r) = P(r) - P(r + \Delta r)$. In the calculations the relevant
probability distributions have been obtained over the whole range of
the observed separations with $\Delta r$ = 0".1. Then, using the least
squares method the best fit value of the count slope $b$ has been
determined for each of the randomized distributions. Finally, the
average and rms of $b$ were calculated. The rms obtained in this
way results only from the randomization of counts.
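A minimal sketch of such a fit is given below. It evaluates the source term of Eq. (11) on a grid of separations, converts both sides to differential form, and selects the slope by an unweighted grid search. The unweighted sum of squares, the grid search and all function names are assumptions of this example; the normalization `norm` is treated as fixed, as described in Sect. 2.

```python
import math

def gamma_ratio(k, b):
    """Gamma(k - b + 1) / Gamma(k), the weight of k-photon sources in Eq. (8)."""
    return math.exp(math.lgamma(k - b + 1.0) - math.lgamma(k))

def eq11_source_term(b, norm, n_t, p_r1, cal_p):
    """Left-hand side of Eq. (11) on the grid of separations."""
    out = []
    for i, p1 in enumerate(p_r1):
        s = sum(gamma_ratio(k, b) * (1.0 - prof[i]) for k, prof in cal_p.items())
        out.append(norm / n_t * p1 * s)
    return out

def differences(values):
    """Differential distribution: value(r) - value(r + dr) on a uniform grid."""
    return [a - c for a, c in zip(values, values[1:])]

def fit_slope(p_r, p_r1, cal_p, norm, n_t, b_grid):
    """Grid-search least-squares estimate of b from binned NNST distributions."""
    data = differences([p1 - p for p1, p in zip(p_r1, p_r)])
    best_b, best_chi = None, math.inf
    for b in b_grid:
        model = differences(eq11_source_term(b, norm, n_t, p_r1, cal_p))
        chi = sum((d - m) ** 2 for d, m in zip(data, model))
        if chi < best_chi:
            best_b, best_chi = b, chi
    return best_b
```

Repeating `fit_slope` for each of the randomized data sets and taking the mean and rms of the returned values mirrors the last step of the procedure.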

To facilitate comparison of the present results with
those available in the literature, we have adopted from
Georgakakis et al. (2008) the number counts model for the bright
sources. In effect, we fixed the normalization and the number
counts slope in the flux range not covered by the present analysis,
i.e. at fluxes above $S = 2.9 \times 10^{-16}$ cgs or $k_{\max} = 20$. The
present calculations provide the best estimate of the slope below that
flux, down to the level determined by the sources generating 2
counts. The average flux of those sources depends on the number
counts slope and within the power law approximation amounts
to $S \approx (2 - b) \cdot \mathrm{cf}$, where cf = $1.469 \times 10^{-17}$ cgs is the average
conversion factor (see Table 2).
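The correspondence between the count threshold and the flux threshold quoted above is a direct application of Eq. (6). The snippet below is a simple arithmetic illustration using the average soft-band conversion factor from Table 2; the function names are introduced here.

```python
CF_SOFT = 1.469e-17   # average soft-band conversion factor, erg cm^-2 s^-1 per count

def counts_to_flux(s_counts, cf=CF_SOFT):
    """Eq. (6) rearranged: physical flux S = s * cf."""
    return s_counts * cf

def flux_to_counts(flux, cf=CF_SOFT):
    """Eq. (6): flux expressed in instrumental counts, s = S / cf."""
    return flux / cf

print(f"k_max = 20 counts -> S = {counts_to_flux(20):.2e} erg cm^-2 s^-1")
```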

The probability distributions $\Delta P(r)$ and $\Delta P(r \mid 1)$ averaged
over 12 data sets are presented in the upper panel of Fig. 3. A
histogram in the lower panel shows the difference between the
two distributions; the error bars indicate the histogram rms
uncertainties obtained from the 12 pixel randomizations. Dots show
the model histogram obtained for the NNST best solution slope
of $b = 1.595$.

In Fig. 4 the results based on the NNST for the S band
are superimposed on the differential number counts presented
by Georgakakis et al. (2008). The counts are normalized to the
Euclidean slope of -2.5. Full dots and dotted line denote the
counts and model by Georgakakis et al. (2008), open circles -
the counts by Kim et al. (2007), and the dot-dash curve - the
predicted AGN counts from Ueda et al. (2003). The thick line
shows the NNST best fit solution with the slope $b = 1.595$
and the normalization fixed at $S = 2.9 \times 10^{-16}$ cgs. The number
counts above that flux are described by the Georgakakis et al.
(2008) model, i.e. over a wide range of fluxes the counts are
approximated by the power law with the differential slope of
-1.58. It is clear that the NNST solution is in full agreement
with the direct source counts derived from the deep Chandra
exposures. One should emphasize that the present NNST solution
is based on a relatively shallow exposure as compared to the
Georgakakis et al. (2008) data. In fact, the sources at the low flux
end of the present solution on average generate in our data
essentially less than 2 counts each. Evidently, the NNST is capable
of providing a sensitive estimation method for the population
of sources which cannot be recognized as individual entities. It
is worth noting that counts due to the sources generating the signal,
i.e. producing $2 \le k \le 20$ photons, constitute a small fraction
of all the counts. In the present model just ~1080 counts or
2.2 % come from these sources while the remaining 97.8 % are
distributed randomly.

Fig. 4. Differential number counts in the 0.5 - 2 keV band
normalized to the Euclidean slope. The data points and the dotted
line are taken from Georgakakis et al. (2008); dot-dash curve
shows the AGN model by Ueda et al. (2003); solid lines - the
number counts estimates based on the NNST: thick solid line -
the best power-law fit, thin solid lines - 1σ uncertainty range
due to the counting statistics; dashed lines - maximum total
uncertainty including potential errors generated by inaccuracies
of the PSF fitting. The count normalization at the bright
end of the NNST model is fixed at the amplitude given by the
Georgakakis et al. (2008) approximation.

4.2. Error estimates 

A small fractional contribution of counts produced by sources to 
the total number is to be blamed for the rather large statistical un- 
certainties of the NNST method. In our case the rms of the slope 
b in the S band amounts to 0.230. The situation is even worse in 
the H band, where the rms uncertainty of the slope reaches 0.34 
(see below). 



The question of systematic errors is less straightforward.
Several potential effects could influence our estimate of the
count slope. The first one, already discussed in Sect. 3.2, is
associated with our PSF approximation. The complex shape
and intricate position dependence of the PSF demand an
approximate treatment. Our PSF fits inevitably introduce errors.
Unfortunately, the amplitude of these errors is difficult to assess.
Because of that we adopted a cautious approach to this problem.
Alongside the best fit PSF model we have considered two ancillary
sets of PSFs which delineate the observed count distribution
of the observed strong sources and also confine possible deviations
introduced by our simplified model of the PSF variations
over the field of view. Visual comparison of the actual count
distributions and our model PSF shows that the ±15 % modification
of the PSF width accounts for any potential deficiencies of our
PSF calculations.

The ±15 % uncertainty of the PSF width introduces a substantial
uncertainty of the slope estimate. For the PSF wider than
the best fit by 15 % we get b = 1.744 ± 0.207, while for the PSF
15 % narrower, b = 1.421 ± 0.267. One can summarize these results
as follows. Statistical 1σ limits around the best fit solution of
b = 1.595 are defined as 1.365 < b < 1.825, while the combined
statistical and systematic uncertainties are 1.154 < b < 1.951.
The statistical uncertainties are shown in Fig. 4 with thin solid
lines and the total uncertainties with dashed lines. One should
note that these "total" error estimates are highly conservative.
They have been obtained by simple addition of the systematic
and statistical errors assuming their highest "possible" values.
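
The quoted ranges can be reproduced with a few lines of arithmetic. The sketch below is only one reading of the "simple addition" described above: it assumes that the total range is the envelope of the 1σ intervals obtained for the best-fit, wider, and narrower PSF. The function names are illustrative and do not come from the paper.

```python
# Slope estimates (b, rms) quoted in the text for the S band.
BEST = (1.595, 0.230)      # best-fit PSF
WIDER = (1.744, 0.207)     # PSF 15 % wider than the best fit
NARROWER = (1.421, 0.267)  # PSF 15 % narrower than the best fit


def one_sigma_interval(b, rms):
    """Return the (lower, upper) 1-sigma limits for a slope estimate."""
    return b - rms, b + rms


def combined_range(*estimates):
    """Envelope of the 1-sigma intervals of all the supplied estimates."""
    intervals = [one_sigma_interval(b, rms) for b, rms in estimates]
    return min(lo for lo, _ in intervals), max(hi for _, hi in intervals)


stat_lo, stat_hi = one_sigma_interval(*BEST)
total_lo, total_hi = combined_range(BEST, WIDER, NARROWER)

print(f"statistical: {stat_lo:.3f} < b < {stat_hi:.3f}")  # 1.365 < b < 1.825
print(f"total:       {total_lo:.3f} < b < {total_hi:.3f}")  # 1.154 < b < 1.951
```

The lower total limit comes from the narrower PSF (1.421 - 0.267) and the upper one from the wider PSF (1.744 + 0.207), which is why the total range is conservative.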

Another source of error in the slope estimate is related to the
uncertainty of the number counts normalization. Any modification
of the number of sources at the strong flux end affects the
count slope as well. However, in the relevant flux range the
realistic normalization uncertainties remain small (see Fig. 4) and
do not considerably affect the slope uncertainty range.

Variations of the conversion factor over the field of view also
do not play a significant role in our slope estimates. As pointed
out in Sect. 3.1, one can incorporate these variations within the
investigated area into the uncertainty of the count normalization $N'$.
Assuming the differential number counts $N(S) = N S^{-b}$ (with
flux $S$ in erg cm$^{-2}$ s$^{-1}$), the relationship between the
normalization expressed in counts, $N'$, and the conversion factor $cf$ is

$$N' = N \cdot cf^{\,1-b} . \qquad (14)$$

Thus, for $b = 1.595$ and the rms of the conversion factor at the
level of ~5.9 % (Table 2), the uncertainty of $N'$ amounts to ~3.5 %
and is small in comparison with the uncertainty of $N$ itself.
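
The ~3.5 % figure follows from first-order error propagation. If the normalization scales with the conversion factor as cf^(1-b), as Eq. (14) indicates, a relative rms of the conversion factor translates into a relative uncertainty |1 - b| times as large. The short sketch below checks this arithmetic; the helper names are illustrative, and the exact-change function is added only as a cross-check of the linear approximation.

```python
def relative_norm_uncertainty(b, rel_cf_rms):
    """First-order relative uncertainty of a quantity scaling as cf**(1 - b)."""
    return abs(1.0 - b) * rel_cf_rms


def exact_relative_change(b, rel_cf_shift):
    """Exact fractional change of cf**(1 - b) when cf shifts by rel_cf_shift."""
    return (1.0 + rel_cf_shift) ** (1.0 - b) - 1.0


b = 1.595
rel_cf_rms = 0.059  # ~5.9 % rms of the conversion factor in the S band

first_order = relative_norm_uncertainty(b, rel_cf_rms)
exact = exact_relative_change(b, rel_cf_rms)

print(f"first-order estimate: {100 * first_order:.1f} %")  # 3.5 %
print(f"exact change for a +1 rms shift of cf: {100 * exact:.2f} %")
```

The exact change is negative because the exponent 1 - b is negative for b > 1, and its magnitude stays close to the first-order value.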

Substantial variations of the PSF width with the distance
from the telescope axis limit the effectiveness of the NNST
method. This is because the nearest neighbor distribution $P(r)$
in Eq. (11) is estimated using the actual data, i.e. the actual
distribution of sources, while the probability $P(r|k)$ is determined
by averaging the model sources over the field of view. This feature
introduces additional uncertainty if the number of sources
generating $k \approx k_{\max}$ counts is small, but should be of
lesser importance in the deep Chandra exposures.

4.3. The hard band 

The ratio of counts produced by discrete sources to all the
counts is substantially smaller in the 2-8 keV band than in the
0.5-2 keV band. As a result, the NNST method is less effective in
the H band than in the S band, i.e. the slope estimates in the H band
are subject to higher statistical uncertainties. Here we briefly
summarize the results in the H band.

As reference data we used those published by Georgakakis et al.
(2008). The counts and the analytic fit in the 2-10 keV band given in
that paper were converted to the 2-8 keV band assuming a power-law
spectrum with the photon index $\Gamma = -1.4$.

After the removal of strong sources, the area and the number of
counts used in the NNST amount to 94.4 sq. arcmin and 101596,
respectively. Applying the same procedure as in the S band, the best
fit slope b = 1.676 ± 0.340 was found using the 12 randomized
distributions. This is shown in Fig. 5 with a thick line and two thin
lines. The sources represented by the NNST solution, i.e. sources
producing 2 < k < 20 counts, contribute just 1214 photons, or 1.19 %
of all the counts (both values carry asymmetric uncertainties). Due
to the large statistical uncertainty in this band the NNST does not
provide tight limits on the number counts, and we have not plotted
here the lines representing the range of systematic uncertainties.

Fig. 5. Differential number counts in the 2-8 keV band normalized to
the Euclidean slope. The data points and the dashed line are
constructed using the 2-10 keV band from Georgakakis et al. (2008);
solid lines - the number counts estimates based on the NNST: thick
solid line - the best power-law fit, thin solid lines - 1σ uncertainty
range due to the counting statistics.

5. Conclusions

We have estimated the source number counts in the S band down to
$\sim 2 \times 10^{-17}$ cgs using the merged data with the integrated
exposure time of ~465 ks. This flux level is below the standard
detection threshold for individual sources in the deepest Chandra
exposures of 2 Ms (Kim et al. 2007).

Our slope estimate below $S = 10^{-16}$ cgs agrees with the actual
discrete source counts determined using such deep observations. It
shows the potential of the NNST as an effective tool in the
investigation of the extremely weak source population. In the second
paper of this series we plan to apply the NNST method to the Chandra
Deep Fields. With a 2 Ms exposure the NNST will allow us to assess
number counts down to $\sim 4 \times 10^{-18}$ cgs in the 0.5-2 keV
band and to $\sim 2 \times 10^{-17}$ cgs in the 2-8 keV band, although
in the latter case the expected accuracy of our estimate might be
quite low.

The constraints obtained for the H band are not restrictive. This is
because the contribution of the non-X-ray counts increases with
energy and the data become strongly contaminated by the particle
background, which effectively "dilutes" the count concentrations
produced by the sources. In the next paper some prospects for
improving the S/N ratio above 2 keV will be explored.

Appendix A: Impact of the exposure fluctuations on the NNST

Effects of small variations of the exposure over the field of view on
the probability distributions $P(r)$ and $P(r|1)$ in Eq. (11) are
investigated.

The observed distribution of counts is considered here as the
realization of a Poisson process. According to this premise, the
probability of getting an event in the x-y plane of the field of view
(fov) is described by a smooth function $p(x,y)$. The $p(x,y)$
amplitude is proportional to the exposure map (Sect. 3.1). One can
define the "true" probability density $p_o(x,y)$ which would describe
the distribution of counts expected for a perfect instrument with a
flat exposure map and, consequently, a constant conversion factor.
Thus, the $p$ distribution is related to $p_o$ and the fluctuations of
the exposure map, EM:

$$p(x,y) = p_o(x,y) \, \frac{EM(x,y)}{EM_o} , \qquad (A.1)$$



where $EM_o$ is the exposure map of the perfect instrument. A natural
normalization of the exposure map is assumed: $\langle EM(x,y) \rangle
= EM_o$, where the brackets $\langle \ldots \rangle$ denote the
averaging over the fov. Since the distributions $p_o$ and EM are
independent, $\langle p \rangle = \langle p_o \rangle$. To quantify the
amplitude of the EM fluctuations with a single parameter
$\varepsilon$, we define a function $f(x,y)$:

$$\chi(x,y) = \frac{EM(x,y)}{EM_o} = 1 + \varepsilon\, f(x,y) , \qquad (A.2)$$

in such a way that $\langle f \rangle = 0$ and $\sigma_f^2 = \langle
f^2 \rangle = 1$. We also have $\langle \chi \rangle = 1$ and
$\sigma_\chi = \varepsilon$, where $\sigma_\chi$ is the rms of $\chi$.
Since $\sigma_\chi = \sigma_{EM}/EM_o$, the data in Table 2 show that
the $\varepsilon$ parameter characterizing the present observational
material remains small; in particular, it is equal to 0.059 and 0.062
for the energy bands S and H, respectively.

The expected nearest neighbor probability distributions $P(r|1)$ and
$P(r)$ are related to the count distribution $p$ in the following way:

$$P(r|1) = \left\langle e^{-\pi r^2 p(x,y)} \right\rangle , \qquad (A.3)$$

$$P(r) = \frac{1}{\langle p \rangle} \left\langle p(x,y)\, e^{-\pi r^2 p(x,y)} \right\rangle . \qquad (A.4)$$

Substituting Eqs. (A.1) and (A.2) into Eqs. (A.3) and (A.4), one can
expand the distributions $P(r|1)$ and $P(r)$ in powers of the parameter
$\varepsilon$. Since the distributions of $f(x,y)$ and $p_o(x,y)$ are
uncorrelated, one finds that the linear term in $\varepsilon$ vanishes.
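
The vanishing of the linear term can be checked numerically. The sketch below is an illustration under stated assumptions, not the calculation used in the paper: it reads Eqs. (A.3) and (A.4) as P(r|1) = <exp(-pi r^2 p)> and P(r) = <p exp(-pi r^2 p)> / <p>, with p = p_o (1 + eps f), and it uses hypothetical grids of p_o and f values. Every p_o value is paired with every f value, so the two are exactly independent in the averages. The chosen f is symmetric about zero, which removes all odd-order terms and not only the linear one. If the leading correction is of order eps^2, doubling eps should increase the deviation from the eps = 0 case by a factor close to 4.

```python
import math


def nn_distributions(r, p0_values, f_values, eps):
    """Return (P(r|1), P(r)) for p = p0 * (1 + eps * f).

    Every p0 value is paired with every f value, so the two quantities
    are exactly independent in the averages.
    """
    sum_w = sum_pw = sum_p = 0.0
    for p0 in p0_values:
        for f in f_values:
            p = p0 * (1.0 + eps * f)
            w = math.exp(-math.pi * r * r * p)
            sum_w += w
            sum_pw += p * w
            sum_p += p
    n = len(p0_values) * len(f_values)
    return sum_w / n, sum_pw / sum_p


# Hypothetical count densities (arbitrary units) and a symmetric
# fluctuation pattern, standardized to zero mean and unit variance.
p0_values = [0.5 + i / 200 for i in range(201)]
raw = [-1.0 + j / 50 for j in range(101)]
mean = sum(raw) / len(raw)
rms = math.sqrt(sum((v - mean) ** 2 for v in raw) / len(raw))
f_values = [(v - mean) / rms for v in raw]

r = 1.0 / math.sqrt(math.pi)  # radius chosen so that pi * r**2 * p0 ~ 1

base = nn_distributions(r, p0_values, f_values, 0.0)
small = nn_distributions(r, p0_values, f_values, 0.03)
large = nn_distributions(r, p0_values, f_values, 0.06)

# Deviation at eps = 0.06 divided by the deviation at eps = 0.03,
# for P(r|1) and P(r) respectively.
ratios = [(hi - ref) / (lo - ref) for ref, lo, hi in zip(base, small, large)]
print(ratios)
```

Both ratios come out close to 4 rather than 2, as expected for a correction that starts at second order in eps. The larger test value of 0.06 is comparable to the exposure-map fluctuation amplitudes of 0.059 and 0.062 quoted for the S and H bands.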



